Abstract — This paper introduces GODDeS: a fully distributed 
self-organizing decision-theoretic routing algorithm designed to 
effectively exploit high quality paths in lossy ad-hoc wireless 
environments, typically with a large number of nodes. The 
routing problem is modeled as an optimal control problem for a 
decentralized Markov Decision Process, with links characterized 
by locally known packet drop probabilities that either remain 
constant on average or change slowly. The equivalence of this 
optimization problem to that of performance maximization of 
an explicitly constructed probabilistic automaton allows us to 
effectively apply the theory of quantitative measures of proba- 
bilistic regular languages, and design a distributed highly efficient 
solution approach that attempts to minimize source-to-sink drop 
probabilities across the network. Theoretical results provide 
rigorous guarantees on global performance, showing that the 
algorithm achieves near-global optimality, in polynomial time. 
It is also argued that GODDeS is significantly congestion-aware, 
and exploits multi-path routes optimally. Theoretical development 
is supported by high-fidelity network simulations. 

Index Terms — Probabilistic Finite State Machines; Language 
Measure; Ad-hoc Routing; Optimal Routing 



I. Introduction & Motivation 

The routing problem has been widely studied in the context 
of ad-hoc wireless networks, and reported algorithms can be 
broadly classified as follows. A routing protocol is pro-active 
(e.g. DBF (Distributed Bellman-Ford) [1] and DSDV (Highly 
Dynamic Destination-Sequenced Distance Vector routing) [2]), 
if fresh destination lists and their routes are maintained by 
periodically distributing routing tables; it is reactive (e.g. 
AODV (Ad-hoc On-demand Distance Vector) [3] and DSR 
(Dynamic Source Routing) [4]) if routes are computed if 
and when necessary by flooding the network with Route 
Request packets. Pro-active protocols suffer from expensive 
route maintenance and slow reaction to topology changes, 
while reactive methods have high latency in discovery and 
induce congestion due to periodic flooding. Hybrid protocols 
attempt to combine the advantages of both philosophies, e.g. 
HRPLS (Hybrid Routing Protocol for Large Scale Mobile Ad 
Hoc Networks with Mobile Backbones) and HSLS (Hazy 
Sighted Link State routing protocol) [6]. Protocols may also be 
classified as being either distance-vector or link-state driven. 
In the former case, the computed distance to all nodes is 
exchanged with neighbors (e.g. DSDV, AODV); while in the 
latter, computed distances to the neighbors are exchanged with 
all nodes (e.g. OLSR (Optimized Link State Routing) [7], 
ZHLS (Zone-Based Hierarchical Link State) [8]). Link state 
protocols maintain better Quality Of Service (QOS), but suffer 
from poor scalability. Distance vector protocols have less 
control traffic, but maintaining QOS is more difficult. Other 
approaches use geographic, or power information, and in the 
context of sensor networks, query based routing strategies (e.g. 
Directed Diffusion [9]) have been proposed. 

Reported ad hoc routing protocols for wireless networks 
primarily focus on node mobility, rapidly changing topologies, 
overhead, and scalability; with little attention paid to finding 
high-quality paths in the face of lossy wireless links. An 
implicit assumption is that links either work well or do not work 
at all; which is not reasonable in the wireless case where many 
links have intermediate loss ratios. This problem has been 
partially addressed by designing new quality-aware metrics 
such as the expected transmission count (ETX) [10], where 
the authors correctly note "minimizing hop-count maximizes 
the distance traveled by each hop, which is likely to minimize 
signal strength and maximize the loss ratio". Even if the best 
route is one with minimal hop-count, there may be many routes 
(particularly in dense networks) of the same minimum length 
with widely varying qualities; the arbitrary choice made by most 
minimum hop-count metrics is not likely to select the best. 
The problem is also crucial in multi-rate networks [11], where 
the routing protocol must select from the set of available links. 
While in single-rate networks all links are equivalent, in multi- 
rate networks each available link may operate at a different 
rate. Thus the routing protocol is presented with a complex 
trade-off decision: Long distance links take fewer hops, but 
the links operate slower; short links can operate at high rates, 
but more hops are required. 

In this paper, we give a theoretical solution to this poten- 
tially large-scale decision problem via formulating a proba- 
bilistic routing policy that very nearly minimizes the end- 
to-end packet drop probabilities. In particular, the routing 
problem is modeled and solved as an optimal control prob- 
lem for a Decentralized Markov Decision Process (D-MDP). 
Extensively used for centralized decision making in stochastic 
environments, Markov decision processes (MDPs) have 
recently been extended to decentralized multi-agent settings [12]. 
In the context of ad-hoc routing, we begin by assuming 
that the communication links are imperfect, and are 
characterized by locally known drop probabilities. The mean 
or expected values of the link-specific drop probabilities, and 
the network topology, are assumed to be either constant 
or changing over a time scale which is significantly slower 
than that of the communication dynamics. We then 
seek local routing decisions that maximize throughput in the 
sense of minimizing the source-to-sink probability of packet 
drops. The Markov structure emerges, since we assume that 
the local link-specific drop probabilities are independent of the 
history of sequential link traversal by individual packets. 

The results developed in this paper effectively resolve the 
issues described above (and do more, actually attaining near 
global optimality), and would seem to be a straightforward 
solution scheme. Nevertheless, to the best of the author's 
knowledge, such an approach has not been previously investigated. 
The reason for this apparent neglect (which also highlights 
the key theoretical contribution of this paper) is as follows: 
Recent investigations [12], [13] into the solution complexity of 
decentralized Markov decision processes have shown that the 
problem is exceptionally hard even for two agents; illustrating 
a fundamental divide between centralized and decentralized 
control of MDPs. In contrast to the centralized approach, 
the decentralized case provably does not admit polynomial- 
time algorithms. Furthermore, assuming EXP ≠ NEXP, the 
problems require super-exponential time to solve in the worst 
case. Such negative results do not preclude the possibility of 
obtaining near-optimal solutions efficiently. This is precisely 
what we achieve in this paper, in the context of the routing 
problem. We show that a highly efficient, fully distributed, 
decision algorithm can be designed that effectively solves the 
distributed MDP such that the control policy, on convergence, 
is within an ε bound of the global optimum. Furthermore, one 
can freely choose the error bound ε (and make it as small 
as one wishes), with the caveat that the convergence time 
increases (with no finite upper bound) with decreasing ε. 

We call this algorithm GODDeS (Globally ε-Optimal Routing 
Via Distributed Decision-theoretic Self-organization). 
Instead of using a standard MDP formulation, we use a problem 
representation based on Probabilistic Finite State Automata 
(PFSA), which allows us to set up the decision problem as that 
of performance maximization of PFSA, and obtain solutions 
using the recently reported quantitative measures of probabilistic 
regular languages [14]. This shift of modeling paradigm 
is the quintessential insight that allows one to achieve near-global 
optimality in polynomial time. Theoretical results also 
establish that GODDeS is highly scalable, optimally takes 
advantage of existing multi-path routes, and is expected to be 
significantly congestion-aware. For simplicity of exposition, 
a single sink is considered throughout the paper. This is 
not a serious restriction, since the results carry over to the 
general case with ease. The resulting algorithm is both pro- 
active and reactive, but not in the usual sense of reported 
hybrid protocols. It uses both distance-vector (in a generalized 
sense via the language-measure construction) and link-state 
information, and uses local multi-cast to forward messages; 
optimally taking advantage of multi-path routing. 

The rest of the paper is organized in six sections. Section II 
briefly summarizes the theory of quantitative measures 
of probabilistic regular languages, and the pertinent 
approaches to centralized performance maximization of PFSA. 
Section III develops the PFSA model of an ad-hoc network, 
and Section IV presents the key theoretical development for 
decentralized PFSA optimization. Section V validates the 
theoretical development with high fidelity simulation results on 
the NS2 network simulator, and discusses the key properties 
and characteristics for the proposed routing algorithm. The 
paper is summarized and concluded in Section VII with 
recommendations for future work. 



II. Background: Language Measure Theory 

This section summarizes the concept of signed real measure 
of probabilistic regular languages, and its application in 
performance optimization of probabilistic finite state automata 
(PFSA) [14]. A string over an alphabet (i.e. a non-empty finite 
set) $\Sigma$ is a finite-length sequence of symbols from $\Sigma$ [15]. The 
Kleene closure of $\Sigma$, denoted by $\Sigma^*$, is the set of all 
finite-length strings of symbols including the null string $\epsilon$. The string 
$xy$ is the concatenation of strings $x$ and $y$, and the null string 
$\epsilon$ is the identity element of the concatenative monoid. 

Definition 1 (PFSA): A PFSA $G$ over an alphabet $\Sigma$ is 
a sextuple $(Q, \Sigma, \delta, \widetilde{\Pi}, \chi, \mathscr{C})$, where $Q$ is a set of states, 
$\delta : Q \times \Sigma^* \to Q$ is the (possibly partial) transition map; 
$\widetilde{\Pi} : Q \times \Sigma \to [0, 1]$ is an output mapping, known as the 
probability morph function, that specifies the state-specific 
symbol generation probabilities and satisfies $\forall q_i \in Q, \sigma \in \Sigma$, 
$\widetilde{\Pi}(q_i, \sigma) \geq 0$ and $\sum_{\sigma \in \Sigma} \widetilde{\Pi}(q_i, \sigma) = 1$; the state 
characteristic function $\chi : Q \to [-1, 1]$ assigns a signed real weight to 
each state; and $\mathscr{C}$ is the set of controllable transitions that can 
be disabled (Definition 2). 

Definition 2 (Control Philosophy): If $\delta(q_i, \sigma) = q_k$, then 
the disabling of $\sigma$ at $q_i$ prevents the state transition from $q_i$ 
to $q_k$. Thus, disabling a transition $\sigma$ at a state $q_i$ replaces the 
original transition with a self-loop with identical occurrence 
probability, i.e. we now have $\delta(q_i, \sigma) = q_i$. Transitions that 
can be so disabled are controllable, and belong to the set $\mathscr{C}$. 

Definition 3: The language $L(q_i)$ generated by a PFSA $G$ 
initialized at the state $q_i \in Q$ is defined as: 
$L(q_i) = \{s \in \Sigma^* \mid \delta(q_i, s) \in Q\}$. Similarly, for every $q_j \in Q$, $L(q_i, q_j)$ 
denotes the set of all strings that, starting from the state $q_i$, 
terminate at the state $q_j$, i.e., 
$L(q_i, q_j) = \{s \in \Sigma^* \mid \delta(q_i, s) = q_j \in Q\}$. 

Definition 4 (State Transition Matrix): The state transition 
probability matrix $\Pi \in [0, 1]^{\mathrm{Card}(Q) \times \mathrm{Card}(Q)}$, for a 
given PFSA, is defined as: $\forall q_i, q_j \in Q$, 
$\Pi_{ij} = \sum_{\sigma \in \Sigma \ \mathrm{s.t.}\ \delta(q_i, \sigma) = q_j} \widetilde{\Pi}(q_i, \sigma)$. 
Note that $\Pi$ is a square non-negative stochastic matrix [16], where $\Pi_{ij}$ is the probability 
of transitioning from $q_i$ to $q_j$. 

Notation 1: We use matrix notations interchangeably for 
the morph function $\widetilde{\Pi}$. In particular, $\widetilde{\Pi}_{ij} = \widetilde{\Pi}(q_i, \sigma_j)$ with 
$q_i \in Q, \sigma_j \in \Sigma$. Note that $\widetilde{\Pi} \in [0, 1]^{\mathrm{Card}(Q) \times \mathrm{Card}(\Sigma)}$ is not 
necessarily square, but each row sums up to unity. 

A signed real measure [17] $\nu^i_\theta : 2^{L(q_i)} \to \mathbb{R} = (-\infty, +\infty)$ is 
constructed on the $\sigma$-algebra $2^{L(q_i)}$ [14], implying that every 
singleton string set $\{s \in L(q_i)\}$ is a measurable set. 

Definition 5 (Language Measure): Let $\omega \in L(q_i, q_j) \subseteq L(q_i)$. 
The signed real measure $\nu^i_\theta$ of every singleton string 
set $\{\omega\}$ is defined as: $\nu^i_\theta(\{\omega\}) = \theta(1 - \theta)^{|\omega|}\, \widetilde{\Pi}(q_i, \omega)\, \chi(q_j)$. 
For every choice of the parameter $\theta \in (0, 1)$, the signed 
real measure of a sublanguage $L(q_i, q_j) \subseteq L(q_i)$ is defined as: 

$$\nu^i_\theta(L(q_i, q_j)) = \sum_{\omega \in L(q_i, q_j)} \theta(1 - \theta)^{|\omega|}\, \widetilde{\Pi}(q_i, \omega)\, \chi_j$$

Similarly, the measure of $L(q_i)$ is defined as 
$\nu^i_\theta(L(q_i)) = \sum_{q_j \in Q} \nu^i_\theta(L(q_i, q_j))$. 

Notation 2: For a given PFSA, we interpret the set of measures 
$\nu^i_\theta(L(q_i))$ as a real-valued vector of length $\mathrm{Card}(Q)$ 
and denote $\nu^i_\theta(L(q_i))$ as $\nu_\theta|_i$. 

The language measure can be expressed vectorially: 

$$\nu_\theta = \theta\,[I - (1 - \theta)\Pi]^{-1} \chi \tag{1}$$

The inverse exists for $\theta \in (0, 1]$ [14]. 
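
Eq. (1) is equivalent to the fixed-point relation $\nu_\theta = (1-\theta)\Pi\nu_\theta + \theta\chi$, which can be iterated without forming the inverse. The following Python sketch is only an illustration under stated assumptions: the function name `language_measure`, the stopping tolerance, and the three-state example (one transient state feeding two absorbing states with characteristics $+1$ and $-1$) are hypothetical and do not come from [14].

```python
def language_measure(Pi, chi, theta, tol=1e-12, max_iter=100_000):
    """Iterate nu <- (1 - theta) * Pi * nu + theta * chi until the update is tiny.

    The map is a contraction with factor (1 - theta) for theta in (0, 1],
    so the iteration approaches theta * [I - (1 - theta) * Pi]^{-1} * chi.
    """
    n = len(chi)
    nu = [0.0] * n
    for _ in range(max_iter):
        new = [theta * chi[i]
               + (1.0 - theta) * sum(Pi[i][j] * nu[j] for j in range(n))
               for i in range(n)]
        if max(abs(a - b) for a, b in zip(new, nu)) < tol:
            return new
        nu = new
    return nu


# Hypothetical example: state 0 moves to state 1 with probability 0.8 and to
# state 2 with probability 0.2; states 1 and 2 are absorbing.
Pi = [[0.0, 0.8, 0.2],
      [0.0, 1.0, 0.0],
      [0.0, 0.0, 1.0]]
chi = [0.0, 1.0, -1.0]          # assumed characteristic weights
nu = language_measure(Pi, chi, theta=0.01)
```

For this example the absorbing states have measures $1$ and $-1$, and the transient state has measure $(1-\theta)(0.8 - 0.2)$, which approaches the difference of the two absorption probabilities as $\theta \to 0^+$.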

Remark 1 (Physical Interpretation): In the limit of $\theta \to 0^+$, 
the language measure of singleton strings can be interpreted 
to be the product of the conditional generation probability 
of the string, and the characteristic weight on the terminating 
state. Hence, the smaller the characteristic, or the smaller the 
probability of generating the string, the smaller is its measure. 
Thus, if the characteristic values are chosen to represent the 
control specification, with more positive weights given to more 
desirable states, then the measure represents how good the 
particular string is with respect to the given specification, 
and the given model. The limiting language measure 
$\nu_0|_i = \lim_{\theta \to 0^+} \theta\,[I - (1 - \theta)\Pi]^{-1} \chi\,\big|_i$ sums up the limiting measures 
of each string starting from $q_i$, and thus captures how good 
$q_i$ is, based on not only its own characteristic, but on how 
good the strings generated in future from $q_i$ are. It is thus a 
quantification of the impact of $q_i$, in a probabilistic sense, on 
future dynamical evolution [14]. 

Definition 6 (Supervisor): A supervisor disables a subset 
of the set $\mathscr{C}$ of controllable transitions and hence there is a 
bijection between the set of all possible supervision policies 
and the power set $2^{\mathscr{C}}$. 

Language measure allows a quantitative comparison of 
different supervision policies. 

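Definition 7 (Optimal Supervision Problem): Given a 
PFSA $G = (Q, \Sigma, \delta, \widetilde{\Pi}, \chi, \mathscr{C})$, compute a supervisor disabling 
$\mathscr{D}^\star \subseteq \mathscr{C}$, s.t. $\nu^\star \geq_{(\mathrm{Elementwise})} \nu^\dagger \ \forall \mathscr{D}^\dagger \subseteq \mathscr{C}$, where $\nu^\star$, $\nu^\dagger$ 
are the limiting measure vectors of supervised plants $G^\star$, $G^\dagger$ 
under $\mathscr{D}^\star$, $\mathscr{D}^\dagger$ respectively. 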

Remark 2: The solution to the optimal supervision problem 
is obtained in [14] by designing an optimal policy using $\nu_\theta$ 
with $\theta \in (0, 1)$. To ensure that the computed optimal policy 
coincides with the one for $\theta \to 0^+$, the authors choose a small, 
but non-zero value for $\theta$ in each iteration step of the design 
algorithm. To address numerical issues, the algorithms reported 
in [14] compute how small a $\theta$ is actually required, i.e., 
compute the critical lower bound $\theta_\star$. Moreover, the solution 
obtained is optimal, unique, efficiently computable, and maximally 
permissive among policies with maximal performance. 
Language-measure-theoretic optimization is not a search 
based approach. It is an iterative sequence of combinatorial 
manipulations that monotonically improves the measures, 
leading to element-wise maximization of $\nu_\theta$ (see [14]). It is 
shown in [14] that 

$$\lim_{\theta \to 0^+} \theta\,[I - (1 - \theta)\Pi]^{-1} \chi = \mathscr{P}\chi \tag{2}$$

where the $i$-th row of $\mathscr{P}$ (denoted as $\wp^i$) is the stationary 
probability vector for the PFSA initialized at state $q_i$. In other 
words, $\mathscr{P}$ is the Cesaro limit of the stochastic matrix $\Pi$, 
satisfying $\mathscr{P} = \lim_{k \to \infty} \frac{1}{k} \sum_{j=0}^{k-1} \Pi^j$ [16]. 

Fig. 1. Node centric decision for packet forwarding with non-zero drop 
probability for all choices 
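
The Cesaro limit can be approximated numerically by averaging matrix powers. The sketch below is a hedged illustration only: the helper names `mat_mul` and `cesaro_average` and the two-state periodic example are assumptions made here, not constructions from [14] or [16]. The example is chosen because the plain powers $\Pi^j$ oscillate and have no limit, while their running average does converge.

```python
def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def cesaro_average(Pi, k):
    """Return (1/k) * sum_{j=0}^{k-1} Pi^j, a finite-k stand-in for the Cesaro limit."""
    n = len(Pi)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # Pi^0
    total = [[0.0] * n for _ in range(n)]
    for _ in range(k):
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
        power = mat_mul(power, Pi)
    return [[total[i][j] / k for j in range(n)] for i in range(n)]


# Hypothetical periodic chain: the state flips deterministically at every step.
flip = [[0.0, 1.0],
        [1.0, 0.0]]
P = cesaro_average(flip, 1000)
```

Each row of `P` is then the uniform vector over the two states, which is the stationary probability vector that the rows $\wp^i$ refer to.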

Proposition 1 (See [14]): Since the optimization maximizes 
the language measure element-wise for $\theta \to 0^+$, it 
follows that for the optimally supervised plant, the standard 
inner product $\langle \wp^i, \chi \rangle$ is maximized, irrespective of the starting 
state $q_i \in Q$. 

Notation 3: The optimal $\theta$-dependent measure for a PFSA 
is denoted as $\nu^\star_\theta$ and the limiting measure as $\nu^\star$. 

III. Modeling Ad-hoc Networks as PFSA 

We consider an ad-hoc network of communicating nodes 
endowed with limited computational resources. For simplicity 
of exposition, we develop the theoretical results under the 
assumption of a single sink. This is not a serious restriction 
and can be easily relaxed. The location and identity of the 
sink is not known a priori to the individual nodes. Inter-node 
communication links are assumed to be imperfect, with the 
possibility of packet drop in each transmission attempt. We 
assume nodes can efficiently gather the following information: 

1) (Set of Neighboring Nodes:) Number and unique id. of 
nodes to which it can successfully send data via a 1-hop 
direct link. 

2) (Local Link Properties:) Link-specific probability of 
packet drop for one-way communication to a specific 
neighbor. 

We further assume that the link-specific packet drop proba- 
bilities are either constant, or change slowly enough, making 
it possible to treat them locally as time-invariant constants 
for route optimization. Note that this does not imply that the 
network topology is assumed to be static; we only require that 
the packet-drop probability for communication from any given 
node $q_i$ to a particular neighbor $q_j$ be more or less constant, 
say 0.7. Thus $q_i$ may choose not to send data to $q_j$ all the time, 
but when it does, then, on the average, 70% of the packets 
get dropped. In practice, the packet drop probabilities may 
vary with current network condition, e.g. congestion leading 
to buffer overflow at specific nodes or (in the context of sensor 
networks) high-traffic nodes running out of power. We do not 
consider these effects in detail; however we briefly describe 
strategies to handle such effects via simple modifications of 
the basic principles laid out under the assumption of con- 
stant drop probabilities. Specific applications, such as wireless 
sensor networks, require routing schemes that in addition to 
throughput, are aware of energy and power issues. Also, data 
priority needs to be respected to enable context-aware routing. 

First we formalize the modeling of an ad-hoc network as a 
probabilistic finite state automaton. 

Definition 8 (Neighbor Map): If $Q$ is the set of all nodes 
in the network, then the neighbor map $\mathcal{N} : Q \to 2^Q$ specifies, 
for each node $q_i \in Q$, the set of nodes $\mathcal{N}(q_i) \subsetneq Q$ (excluding 
$q_i$) to which $q_i$ can communicate via a single hop direct link. 

Definition 9 (Packet Drop Probability): The link specific 
packet drop probability $\lambda_{ij} \in [0, 1]$ is defined to be the limiting 
ratio of the number of packets dropped to the total number of 
packets sent, in communicating from node $q_i$ to node $q_j$. 

Note that the drop probabilities are not constrained to be 
symmetric in general, i.e., $\lambda_{ij} \neq \lambda_{ji}$. Also, note that we 
assume the node-based estimation of these ratios to converge 
fast enough. We visualize the local network around a node $q_0$ 
in the manner illustrated in Figure 1(a) (shown for two neighbors 
$q_1$ and $q_2$). In particular, any packet transmitted from $q_0$ for $q_1$ 
has a drop probability $\lambda_{01}$, and the ones transmitted to $q_2$ have 
a drop probability $\lambda_{02}$. To correctly represent this information, 
we require the notion of virtual nodes ($q^v_{01}$, $q^v_{02}$ in Figure 1(b)). 

Definition 10 (Virtual Node): Given a node $q_i$, and a neighbor 
$q_j \in \mathcal{N}(q_i)$ with a specified drop probability $\lambda_{ij}$, any 
transmitted data-packet from $q_i$ for $q_j$ is assumed to be 
first delivered to a virtual node $q^v_{ij}$, upon which there is 
either an automatic (i.e. uncontrollable) forwarding to $q_j$ with 
probability $1 - \lambda_{ij}$, or a drop with probability $\lambda_{ij}$. The set of 
all virtual nodes in a network of $Q$ nodes is denoted by $Q^v$ 
in the sequel. 

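Hence, the total number of virtual nodes is given by: 

$$\mathrm{Card}(Q^v) = \sum_{i : q_i \in Q} \mathrm{Card}(\mathcal{N}(q_i)) \tag{3}$$

and the cardinality of the set of virtual nodes satisfies: 

$$0 \leq \mathrm{Card}(Q^v) \leq \mathrm{Card}(Q)^2 - \mathrm{Card}(Q) \tag{4}$$

Fig. 2. 6-node Network and PFSA model with 23 states (16 virtual nodes, 
6 nodes, 1 dump state). Panels: a 6-node network; PFSA model with virtual nodes (controllable transitions marked). 

We are ready to model an ad-hoc network as a PFSA. 

Definition 11 (PFSA Model of Network): For a given set 
of nodes $Q$, the function $\mathcal{N} : Q \to 2^Q$, the link specific 
drop probabilities $\lambda_{ij}$ for any node $q_i$ and a neighbor 
$q_j \in \mathcal{N}(q_i)$, and a specified sink $q_{\mathrm{SINK}} \in Q$, the PFSA 
$\mathbb{G}_N = (Q^N, \Sigma, \delta, \widetilde{\Pi}, \chi, \mathscr{C})$ is defined to be a model of the 
network, where (denoting $\mathrm{Card}(\mathcal{N}(q_i)) = m$): 

o States: $Q^N = Q \cup Q^v \cup \{q_D\}$ 

where $Q^v$ is the set of virtual nodes, and $q_D$ is a dump state 
which models packet loss. 

o Alphabet: $\Sigma = \bigcup_{i : q_i \in Q} \ \bigcup_{j : q_j \in \mathcal{N}(q_i)} \{\sigma_{ij}\} \cup \{\sigma_D\}$ 

where $\sigma_{ij}$ denotes transmission (attempted or actual) from $q_i$ to $q_j$, 
and $\sigma_D$ denotes transmission to $q_D$ (packet loss). 

o Transition Map: 

$$\delta(q, \sigma) = \begin{cases} q^v_{ij} & \text{if } q = q_i,\ \sigma = \sigma_{ij} \\ q_j & \text{if } q = q^v_{ij},\ \sigma = \sigma_{ij} \\ q_D & \text{if } q = q^v_{ij},\ \sigma = \sigma_D \\ q_D & \text{if } q = q_D,\ \sigma = \sigma_D \\ \text{undefined} & \text{otherwise} \end{cases}$$

o Probability Morph Matrix: 

$$\widetilde{\Pi}(q, \sigma) = \begin{cases} \frac{1}{m} & \text{if } q = q_i,\ \sigma = \sigma_{ij} \\ 1 - \lambda_{ij} & \text{if } q = q^v_{ij},\ \sigma = \sigma_{ij} \\ \lambda_{ij} & \text{if } q = q^v_{ij},\ \sigma = \sigma_D \\ 1 & \text{if } q = q_D,\ \sigma = \sigma_D \\ 0 & \text{otherwise} \end{cases}$$

o Characteristic Weights: 

$$\chi_i = \begin{cases} 1 & \text{if } q_i = q_{\mathrm{SINK}} \\ 0 & \text{otherwise} \end{cases}$$

o Controllable Transitions: $\forall q_i \in Q,\ q_j \in \mathcal{N}(q_i)$, the transition $q_i \xrightarrow{\sigma_{ij}} q^v_{ij}$ belongs to $\mathscr{C}$. 

We note that for a network of $Q$ nodes, the PFSA model may 
have (almost always has, see Figure 2) a significantly larger 
number of states. Using Eq. (4): 

$$\mathrm{Card}(Q^N) = \mathrm{Card}(Q) + \mathrm{Card}(Q^v) + 1 \tag{6}$$

$$\mathrm{Card}(Q) + 1 \leq \mathrm{Card}(Q^N) \leq \mathrm{Card}(Q)^2 + 1 \tag{7}$$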
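
The state counts above can be checked on a toy network. The following Python sketch is a hedged illustration of the construction in Definition 11, not the authors' code: the function name `build_network_pfsa`, the tuple encoding of virtual nodes and symbols, the `"DUMP"` label, and the three-node example with its drop probabilities are all assumptions. It also assumes every physical node has at least one neighbor.

```python
def build_network_pfsa(neighbors, drop):
    """Build states and outgoing transitions of the PFSA model of a network.

    neighbors: dict mapping each physical node to the set of its 1-hop neighbors.
    drop:      dict mapping (q_i, q_j) to the link drop probability lambda_ij.
    Returns (physical, virtual, dump, transitions), where transitions[state] is a
    list of (symbol, next_state, probability) triples.
    """
    physical = sorted(neighbors)
    # One virtual node per ordered (node, neighbor) pair.
    virtual = [(qi, qj) for qi in physical for qj in sorted(neighbors[qi])]
    dump = "DUMP"
    transitions = {dump: [("sigma_D", dump, 1.0)]}      # dump state self-loop
    for qi in physical:
        m = len(neighbors[qi])
        # Controllable transitions: physical node -> virtual node, probability 1/m.
        transitions[qi] = [(("sigma", qi, qj), (qi, qj), 1.0 / m)
                           for qj in sorted(neighbors[qi])]
    for qi, qj in virtual:
        lam = drop[(qi, qj)]
        # Uncontrollable transitions: deliver with 1 - lambda, drop with lambda.
        transitions[(qi, qj)] = [(("sigma", qi, qj), qj, 1.0 - lam),
                                 ("sigma_D", dump, lam)]
    return physical, virtual, dump, transitions


# Hypothetical 3-node network.
neighbors = {"a": {"b", "c"}, "b": {"c"}, "c": {"b"}}
drop = {("a", "b"): 0.1, ("a", "c"): 0.7, ("b", "c"): 0.2, ("c", "b"): 0.2}
physical, virtual, dump, transitions = build_network_pfsa(neighbors, drop)
```

Here the network has 3 physical nodes and 4 virtual nodes, so the model has 8 states, which lies within the bounds of Eq. (7).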
This state-explosion will not be a problem for the distributed 
approach developed in the sequel, since we use the complete 
model $\mathbb{G}_N$ only for the purpose of deriving theoretical 
guarantees. Note that Definition 11 generates a PFSA model 
which can be optimized in a straightforward manner using the 
language-measure-theoretic technique described in Section II 
(see [14] for details). This would yield the optimal routing 
policy in terms of the disabling decisions at each node that 
minimize source-to-sink drop probabilities (from every node 
in the network). To see this explicitly, note that the measure-theoretic 
approach elementwise maximizes 
$\lim_{\theta \to 0^+} \theta\,[I - (1 - \theta)\Pi]^{-1} \chi = \mathscr{P}\chi$, where the $i$-th row of $\mathscr{P}$ (denoted as $\wp^i$) 
is the stationary probability vector for the PFSA initialized 
at state $q_i$ (see Proposition 1). Since the dump state has 
characteristic $-1$, the sink has characteristic 1, and all other 
nodes have characteristic 0, it follows that this optimization 
maximizes the quantity $\wp^i_{\mathrm{SINK}} - \wp^i_{\mathrm{DUMP}}$ for every source state 
or node $q_i$ in the network. Note that $\wp^i_{\mathrm{SINK}}$, $\wp^i_{\mathrm{DUMP}}$ are the 
stationary probabilities of reaching the sink and incurring a 
packet loss to dump respectively, from a given source. Thus, 
maximizing $\wp^i_{\mathrm{SINK}} - \wp^i_{\mathrm{DUMP}}$ for every $q_i \in Q$ guarantees that 
the computed routing policy is indeed optimal in the stated 
sense. However, the procedure in [14] requires centralized 
computations, which is precisely what we wish to avoid. 
The key technical contribution in this paper is to develop 
a distributed approach to language-measure-theoretic PFSA 
optimization. In effect, the theoretical development in the next 
section allows us to carry out the language-measure-theoretic 
optimization of a given PFSA, in situations where we do not 
have access to the complete $\Pi$ matrix, or the $\chi$ vector at any 
particular node (i.e. each node has a limited local view of 
the network), and are restricted to communicate only with 
immediate neighbors. We are interested in not just computing 
the measure vector in a distributed manner, but optimizing the 
PFSA via selected disabling of controllable transitions (see 
Section II). This is accomplished by Algorithm 1. 

Before we embark upon the detailed analysis of Algorithm 1 
in the next section, we briefly elucidate the connection 
with decentralized Markov Decision Processes. The PFSA 
based modeling framework is somewhat different from the 
standard MDP architecture. For example, in contrast to the 
latter, our actions are "controllable" transitions, and have 
probabilities associated with them. Rewards and penalties are 
not associated with individual actions, but with state visitations 
(and modeled via the characteristic weights). We maximize the 
long term or expected reward by maximizing the probability 
of reaching the sink, while simultaneously minimizing the 
probability of reaching the dump state, i.e., a drop, from any 
arbitrary node in the network. More details on relations to the 
standard approach is given in ifTHl . 



IV. Decentralized PFSA Optimization 

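Notation 4: In the sequel, the current measure value, for a 
given $\theta$, at node $q_i \in Q$ is denoted as $\hat{\nu}_\theta|_i$, and the measure 
of the virtual node $q^v_{ij} \in Q^v$ is denoted as $\hat{\nu}_\theta|_{(q^v_{ij})}$. The 
parenthesized entry $(q^v_{ij})$ denotes the index of the virtual node 
in the state set $Q^N$. Similarly, the transition probability 
from $q_i$ to $q^v_{ij}$ is denoted as $\Pi_{i(q^v_{ij})}$. The subscript entry $i(q^v_{ij})$ 
denotes the $ik$-th element of $\Pi$, where $k = (q^v_{ij})$. 

Algorithm 1: Distributed Update of Node Measures 

input : $\mathbb{G}_N = (Q, \Sigma, \delta, \widetilde{\Pi}, \chi, \mathscr{C})$, $\theta$ 
begin 
    Initialize $\forall q_i \in Q,\ \hat{\nu}_\theta|_i = 0$; 
    /* Begin Infinite Asynchronous Loop */ 
    while true do 
        for each node $q_i \in Q$ do 
            if $\mathcal{N}(q_i) \neq \emptyset$ then 
                $m = \mathrm{Card}(\mathcal{N}(q_i))$; 
                for each node $q_j \in \mathcal{N}(q_i)$ do 
                    /* (a1) Internode Communication */ 
                    Query $\hat{\nu}_\theta|_j$ & Drop Prob. $\lambda_{ij}$; 
                    /* (a2) Control Adaptation */ 
                    if $\hat{\nu}_\theta|_j < \hat{\nu}_\theta|_i$ then 
                        $\Pi_{ii} = \Pi_{ii} + \Pi_{i(q^v_{ij})}$; 
                        $\Pi_{i(q^v_{ij})} = 0$;    /* Disable */ 
                    else 
                        if $\Pi_{i(q^v_{ij})} = 0$ then 
                            $\Pi_{i(q^v_{ij})} = \frac{1}{m}$; 
                            $\Pi_{ii} = \Pi_{ii} - \frac{1}{m}$;    /* Enable */ 
                        endif 
                    endif 
                    /* (a3) Updating Virtual Nodes */ 
                    $\hat{\nu}_\theta|_{(q^v_{ij})} = (1 - \theta)(1 - \lambda_{ij})\, \hat{\nu}_\theta|_j$; 
                endfor 
            endif 
            /* (a4) Updating Node */ 
            $\hat{\nu}_\theta|_i = \sum_{j : q_j \in \mathcal{N}(q_i)} (1 - \theta)\, \Pi_{i(q^v_{ij})}\, \hat{\nu}_\theta|_{(q^v_{ij})} + \theta \chi|_i$; 
        endfor 
    endw 
end 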

Algorithm 1 establishes a distributed, asynchronous update 
procedure which achieves the following: 

$$\forall q_i \in Q,\quad \hat{\nu}_\theta|_i \xrightarrow{\ \text{global convergence}\ } \nu^\star_\theta|_i \tag{8}$$

where $\nu^\star_\theta|_i$ is the optimal measure for $q_i \in Q$ that would 
be obtained by optimizing the PFSA $\mathbb{G}_N$, for a given $\theta$, in 
a centralized approach (see Section II). The optimal routing 
policy can then be obtained by forwarding packets to neighboring 
nodes which have a better or equal current measure 
value. If more than one such neighbor is available, then one 
chooses the forwarding node randomly, in an equiprobable 
manner. In fact, the nodes need not wait for exact convergence; 
in the sequel we show that this forwarding policy converges 
to the globally optimal routing policy, that is, for a sufficiently 
small $\theta$, it maximizes the probability of reaching the sink, while 
simultaneously minimizing the probability of packet drops. 
Furthermore, choosing randomly between qualifying neighboring 
nodes leads to significant congestion resilience. These 
issues are elaborated in the sequel (Proposition 7). First, 
Algorithm 1 is analyzed to establish convergence. 

Algorithm 1 has four distinct parts, marked as (a1), (a2), 
(a3) and (a4). Part (a1) involves internode communication, 
to enable a particular node $q_i \in Q$ to ascertain the current 
measure values of neighboring nodes, and the drop probabilities 
$\lambda_{ij}$ on respective links. Recall that we assume the 
probabilities $\lambda_{ij}$ to be more or less constant; nevertheless 
nodes estimate these values to adapt to changing (albeit 
slowly) network conditions. Part (a2) is the control adaptation, 
in which the nodes decide, based on local information, the 
set of forwarding nodes. Part (a3) is the computation of 
the updated measure values for the virtual nodes $q^v_{ij}$ where 
$j : q_j \in \mathcal{N}(q_i)$. Finally, part (a4) updates the measure of the 
node $q_i$ based on the computed current measures of the virtual 
nodes. We note that Algorithm 1 only uses information that 
is either available locally, or that which can be queried from 
neighboring nodes. 
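
The four parts can be illustrated with a small simulation. The Python sketch below is not the authors' implementation and rests on several assumptions that are labeled here and in the comments: repeated sequential sweeps stand in for the asynchronous loop; a disabled transition is treated as a self-loop at the node with the same probability $1/m$ (Definition 2), and that self-loop term is included in the node update; only the sink carries a non-zero characteristic. The names `distributed_update` and `next_hop` and the three-node network are hypothetical.

```python
import random


def distributed_update(neighbors, drop, sink, theta, sweeps=1000):
    """Sequential simulation of parts (a1)-(a4) on a small network."""
    nu = {q: 0.0 for q in neighbors}                     # all measures start at 0
    enabled = {q: set(neighbors[q]) for q in neighbors}
    for _ in range(sweeps):              # assumption: sweeps replace asynchrony
        for qi in sorted(neighbors):
            m = len(neighbors[qi])
            total = 0.0
            for qj in sorted(neighbors[qi]):
                # (a1) query the neighbor's measure and the link drop probability
                nu_j, lam = nu[qj], drop[(qi, qj)]
                # (a2) disable links to strictly worse neighbors, enable the rest
                if nu_j < nu[qi]:
                    enabled[qi].discard(qj)
                else:
                    enabled[qi].add(qj)
                # (a3) measure of the virtual node between qi and qj
                nu_virtual = (1.0 - theta) * (1.0 - lam) * nu_j
                if qj in enabled[qi]:
                    total += nu_virtual / m
            # Assumption: disabled transitions become self-loops (Definition 2).
            self_loop = 1.0 - len(enabled[qi]) / m if m else 1.0
            chi = 1.0 if qi == sink else 0.0
            # (a4) node update, including the assumed self-loop term
            nu[qi] = theta * chi + (1.0 - theta) * (total + self_loop * nu[qi])
    return nu, enabled


def next_hop(qi, enabled, rng):
    """Pick a forwarding neighbor uniformly at random among the enabled ones."""
    choices = sorted(enabled[qi])
    return rng.choice(choices) if choices else None


# Hypothetical 3-node network with sink "c".
neighbors = {"a": {"b", "c"}, "b": {"c"}, "c": {"b"}}
drop = {("a", "b"): 0.1, ("a", "c"): 0.7, ("b", "c"): 0.2, ("c", "b"): 0.2}
theta = 0.05
nu, enabled = distributed_update(neighbors, drop, sink="c", theta=theta)
hop = next_hop("a", enabled, random.Random(0))
```

In this example the sink disables its only outgoing link and its measure tends to 1, while nodes `b` and `a` settle at smaller values that reflect the drop probabilities on their paths to the sink. Under the stated assumptions both neighbors of `a` have a better measure than `a`, so `a` keeps both links enabled and picks between them equiprobably.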

Proposition 2 (Convergence): For a network $Q$ modeled as 
a PFSA $\mathbb{G}_N = (Q^N, \Sigma, \delta, \widetilde{\Pi}, \chi, \mathscr{C})$, the distributed procedure 
in Algorithm 1 has the following properties: 

1) Computed measure values for every node $q_i \in Q^N$ are 
non-negative and bounded above by 1, i.e., 

$$\forall q_i \in Q^N,\ \forall t \in [0, \infty),\quad \hat{\nu}^t_\theta|_i \in [0, 1] \tag{9}$$

2) For constant drop probabilities and constant neighbor 
map $\mathcal{N} : Q \to 2^Q$, Algorithm 1 converges in the sense: 

$$\forall q_i \in Q^N,\quad \lim_{t \to \infty} \hat{\nu}^t_\theta|_i = \hat{\nu}^\infty_\theta|_i \in [0, 1] \tag{10}$$

3) Convergent measure values coincide with the optimal 
values computed by the centralized approach: 

$$\forall q_i \in Q^N,\quad \hat{\nu}^\infty_\theta|_i = \nu^\star_\theta|_i \tag{11}$$



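Proof: (Statement 1:) Non-negativity of the measure 
values is obvious. For establishing the upper bound, we use 
induction on computation time $t$. We note that all the measure 
values $\hat{\nu}^t_\theta|_i$ are initialized to 0 at time $t = 0$. The first node to 
change its measure will be the sink, which is updated at some 
time $t = t_0$: 

$$\hat{\nu}^{t_0}_\theta|_{(q_{\mathrm{SINK}})} = 0 + \theta \chi(q_{\mathrm{SINK}}) = \theta \tag{12}$$

where the first term is zero since all nodes still have measure 
zero and the sink characteristic $\chi(q_{\mathrm{SINK}}) = 1$. Thus, there exists 
a non-trivial time instant $t_0$, at which: 

$$\text{(Induction Basis)}\quad \forall q_i \in Q^N,\ \hat{\nu}^{t_0}_\theta|_i \leq 1 \tag{13}$$

Next we assume for time $t = t'$, we have 

$$\text{(Induction Hypothesis)}\quad \forall q_i \in Q^N,\ \forall \tau \leq t',\ \hat{\nu}^{\tau}_\theta|_i \leq 1$$

We consider the next updates for physical nodes and virtual 
nodes separately, and denote the time instant for the next 
updates as $t'_+$. Note that $t'_+$ actually may be different for 
different nodes (asynchronous operation). 

(Virtual Nodes) For any virtual node $q_k = q^v_{ij} \in Q^N$, where 
$q_i, q_j \in Q$, we have: 

$$\hat{\nu}^{t'_+}_\theta|_{(q^v_{ij})} = (1 - \lambda_{ij})(1 - \theta)\, \hat{\nu}^{t'}_\theta|_j \leq 1 \tag{14}$$

(Physical Nodes) For any $q_i \in Q$, where the set of enabled 
neighbors is $E_n = \{q_j \in \mathcal{N}(q_i) \ \mathrm{s.t.}\ \hat{\nu}^{t'}_\theta|_j \geq \hat{\nu}^{t'}_\theta|_i\}$, we have: 

$$\hat{\nu}^{t'_+}_\theta|_i = \frac{1}{\mathrm{Card}(\mathcal{N}(q_i))} \sum_{q_j \in E_n} (1 - \theta)\, \hat{\nu}^{t'_+}_\theta|_{(q^v_{ij})} + \theta \chi|_i \leq \frac{1 - \theta}{\mathrm{Card}(\mathcal{N}(q_i))} \sum_{q_j \in E_n} 1 + \theta \leq 1$$

which establishes Statement 1. 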

(Statement 2:) We claim that for each node $q_i \in Q^N$, 
the sequence of measures $\hat{\nu}^t_\theta|_i$ forms a monotonically non-decreasing 
sequence as a function of the computation time $t$. 
Again, we use induction on computation time. Considering the 
time instant $t_0$ (see Eq. (12)), we note that we have an instant 
up to which all measure values have indeed changed in a non-decreasing 
fashion, since the measure of $q_{\mathrm{SINK}}$ increased to $\theta$, 
while other nodes are still at 0; which establishes the basis. 
For our hypothesis, we assume that there exists some time 
instant $t' > t_0$, such that all measure values have undergone 
non-decreasing updates up to $t'$. We consider the physical node 
$q_i \in Q$ which is the first one to update next, say at the instant 
$t'_+ > t'$. Referring to Algorithm 1, this update occurs by first 
updating the set of virtual nodes $\{q^v_{ij} : q_j \in \mathcal{N}(q_i)\}$. Since 
virtual nodes update as: 

$$\hat{\nu}_\theta|_{(q^v_{ij})} = (1 - \lambda_{ij})(1 - \theta)\, \hat{\nu}_\theta|_j \tag{15}$$

it follows from the induction hypothesis that 

$$\hat{\nu}^{t'_+}_\theta|_{(q^v_{ij})} \geq \hat{\nu}^{t'}_\theta|_{(q^v_{ij})} \tag{16}$$

If the connectivity (i.e. the forwarding decisions) for the 
physical node $q_i$ remains unchanged for the instants $t'$ and 
$t'_+$, and since the measure of any neighboring node has not 
decreased (by induction hypothesis), then: 

$$\hat{\nu}^{t'_+}_\theta|_i \geq \hat{\nu}^{t'}_\theta|_i \tag{17}$$

If, on the other hand, the set of disabled transitions for $q_i$ 
changes (e.g. for some $q_j \in \mathcal{N}(q_i)$, $q_i \to q^v_{ij}$ was disabled 
at $t'$ and is enabled at $t'_+$, or vice versa), the measure of node $q_i$ 
is increased by an additive factor scaled by $\frac{1 - \theta}{\mathrm{Card}(\mathcal{N}(q_i))}$, 
which completes the inductive process and establishes our 
claim that the measure values form a non-decreasing sequence 
for each node as a function of the computation time. Since 
a non-decreasing bounded sequence in a complete space must 
converge to a unique limit [17], the convergence: 

$$\forall q_i \in Q^N,\quad \lim_{t \to \infty} \hat{\nu}^t_\theta|_i = \hat{\nu}^\infty_\theta|_i \in [0, 1] \tag{18}$$

follows from the existence of the upper bound established in 
Statement 1. This establishes Statement 2. 
(Statement 3:) From the update equations in Algorithm 1, we note that the limiting measure values satisfy:

$$\nu^\infty_\theta = (1-\theta)\,\Pi\,\nu^\infty_\theta + \theta\chi \;\Longrightarrow\; \nu^\infty_\theta = \theta\left[I-(1-\theta)\Pi\right]^{-1}\chi \qquad (19)$$

which implies that the measure values do indeed converge to the measure vector computed in a centralized fashion (see Eq. (1)). Noting that any further disabling (or re-enabling) would not increase the measure values computed by Algorithm 1, we conclude that this must be the optimal disabling set that would be obtained by the centralized language-measure theoretic optimization of the PFSA $G_N$ (Section II). This completes the proof. ■
Proposition 3 (Initialization Independence): For a network $Q$ modeled as a PFSA $G_N = (Q^N, \Sigma, \delta, \Pi, \chi)$, convergence of Algorithm 1 is independent of the initialization of the measure values, i.e., if $\nu^t_{\theta,a}$ denotes the measure vector at time $t$ with arbitrary initialization $a \in [0,1]^{\mathrm{Card}(Q^N)}$, then:

$$\lim_{t\to\infty} \nu^t_{\theta,a} = \lim_{t\to\infty} \nu^t_\theta \qquad (20)$$

where $\nu^0_{\theta,a} = a$ and $\nu^0_\theta = [0 \cdots 0]^T$.

Proof: The measure update equations in Algorithm 1 dictate that the measure values will have a positive contribution from $a$. Denoting the contribution of $a$ to the measure of node $q_i \in Q$ at time $t$ as $C^t_a(q_i)$, we note that the measure can be written as $\nu^t_{\theta,a}|_i = C^t_a(q_i) + f^t_i$, where $f^t_i$ is independent of $a$. Furthermore, the linearity of the updates implies that $C^t_a(q_i)$ can be used to formulate an inductive argument as follows. We use $k^t_* \in \mathbb{N} \cup \{0\}$ to denote the minimum number of updates that every node in the network has encountered up to time instant $t \in [0,\infty)$. We claim that:

$$\forall q_i \in Q,\ \forall t \in [0,\infty), \quad C^t_a(q_i) \leq (1-\theta)^{k^t_*}\,\|a\|_1 \qquad (21)$$

To establish this claim, we use induction on $k^t_*$. For the basis, we note that there exists a time instant $t_0$ such that $\forall t \leq t_0,\ k^t_* = 0$, implying that

$$\forall \tau \leq t_0, \quad C^\tau_a(q_i) = a_i \leq (1-\theta)^0 \sum_{q_j \in Q} a_j = (1-\theta)^{k^\tau_*}\,\|a\|_1$$

We assume that if at some $t_k$, $k^{t_k}_* = k \in \mathbb{N}$, then:

$$\text{(Induction Hypothesis)} \quad \forall q_i \in Q, \quad C^{t_k}_a(q_i) \leq (1-\theta)^k\,\|a\|_1$$

Next, let $q_i \in Q$ be an arbitrary physical node, and consider the first update of $q_i$ at $t_{k'} > t_k$.

$\i =£ (! ~ Wko^* Ik) + (i - »)n**^*l< + ^ 
=> c*i( qi ) sJ2 (i ~ ^n^oC 1 - A «)(! - - ^) fe H«lli 

+ {l-0)X\ ll {l-6) k \\a\\ 1 + 6 Xl 
^C'hqi) ^ {\.-B) k+1 \\a\\x 

We note that if $k^{t_{k+1}}_* = k + 1$, then every node in $Q$ must have undergone one more update since $t_k$, implying:

$$\forall q_i \in Q, \quad C^{t_{k+1}}_a(q_i) \leq (1-\theta)^{k+1}\,\|a\|_1 \qquad (22)$$

which completes the induction proving Eq. (21). Observing that $\lim_{t\to\infty} k^t_* = \infty$ and $\|a\|_1 < \infty$, we conclude:

$$\forall q_i \in Q, \quad \lim_{t\to\infty} C^t_a(q_i) = 0 \qquad (23)$$

which immediately implies Eq. (20). ■
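Initialization independence can be checked numerically on a toy chain. The sketch below is only an illustration under stated assumptions: it uses a synchronized form of the update, $\nu \leftarrow (1-\theta)\Pi\nu + \theta\chi$, rather than the asynchronous Algorithm 1, and the 4-state transition matrix `PI`, the characteristic vector `CHI`, and the value of `THETA` are hypothetical.

```python
import random

def iterate_measure(pi, chi, theta, nu0, steps):
    """Apply nu <- (1 - theta) * Pi * nu + theta * chi for `steps` synchronous sweeps."""
    nu = list(nu0)
    n = len(nu)
    for _ in range(steps):
        nu = [(1 - theta) * sum(pi[i][j] * nu[j] for j in range(n)) + theta * chi[i]
              for i in range(n)]
    return nu

# Hypothetical chain: states 0 and 1 relay packets, state 2 is the sink, state 3 the dump.
PI = [
    [0.0, 0.8, 0.0, 0.2],
    [0.0, 0.0, 0.9, 0.1],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
CHI = [0.0, 0.0, 1.0, 0.0]   # only the sink carries a non-zero characteristic weight
THETA = 0.05

random.seed(0)
arbitrary_start = [random.random() for _ in range(4)]
from_zero = iterate_measure(PI, CHI, THETA, [0.0] * 4, steps=1000)
from_rand = iterate_measure(PI, CHI, THETA, arbitrary_start, steps=1000)

# The contribution of the initial vector decays like (1 - theta)^k, so the gap vanishes.
gap = max(abs(x - y) for x, y in zip(from_zero, from_rand))
print(from_zero, gap)
```

Both runs settle on the same vector because the difference between them is multiplied by $(1-\theta)\Pi$ at every sweep, which mirrors the bound in Eq. (21).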

Next we investigate the performance of the proposed 
approach, and establish guarantees on global performance 
achieved via local decisions dictated by Algorithm [T] We need 
some technical lemmas, and the notion of strongly absorbing 
graphs, and graph powers. 

Definition 12 (Exact Power of Graph): For a given graph $G = (V, E)$, the exact power $G^d$, for $d \in \mathbb{N}$, is a graph $(V, E')$ such that $(q_i, q_j)$ is an edge in $G^d$ only if there exists a sequence of edges of length exactly $d$ from node $q_i$ to node $q_j$ in $G$.

Definition 13 (Strongly Absorbing Graph): A finite directed graph $G = (V, E)$ ($V$ is the set of nodes and $E \subseteq V \times V$ the set of edges) is defined to be strongly absorbing (SA), if:

1) There are one or more absorbing nodes, i.e., $\exists A \subseteq V$, s.t. every node in $A$ (non-empty) is absorbing.

2) There exists at least one sequence of edges from any node to one of the absorbing nodes in $A$.

3) If $E^d$ denotes the set of edges for the $d$-th exact power of $G$, then, for distinct nodes $q_i, q_j \in V$,

$$(q_i, q_j) \in E \;\Rightarrow\; \forall d \in \mathbb{N},\ (q_j, q_i) \notin E^d \qquad (24)$$
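Definitions 12 and 13 can be made concrete with boolean adjacency matrices. The following is a minimal sketch, not part of the original development; it assumes that an absorbing node is one whose only outgoing edge is a self-loop, and that $G^d$ contains the edge $(q_i, q_j)$ whenever a walk of exactly $d$ edges exists. The function names are illustrative. Because a finite graph has finitely many boolean matrices, the sequence of exact powers eventually repeats, so checking the distinct powers covers every $d \in \mathbb{N}$.

```python
def bool_matmul(a, b):
    """Boolean matrix product: entry (i, j) is True if some k links i -> k -> j."""
    n = len(a)
    return [[any(a[i][k] and b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exact_powers(adj):
    """Return the distinct exact powers G^1, G^2, ... as boolean adjacency matrices."""
    powers, seen = [], set()
    current = [row[:] for row in adj]
    while True:
        key = tuple(map(tuple, current))
        if key in seen:          # powers repeat from here on, nothing new to add
            return powers
        seen.add(key)
        powers.append(current)
        current = bool_matmul(current, adj)

def is_strongly_absorbing(adj):
    n = len(adj)
    # Condition 1: at least one absorbing node (assumed: self-loop is the only out-edge).
    absorbing = [i for i in range(n)
                 if adj[i][i] and not any(adj[i][j] for j in range(n) if j != i)]
    if not absorbing:
        return False
    powers = exact_powers(adj)
    # Condition 2: every node reaches some absorbing node.
    for i in range(n):
        if not any(p[i][a] for p in powers for a in absorbing):
            return False
    # Condition 3: an edge (i, j) between distinct nodes forbids (j, i) in every exact power.
    for i in range(n):
        for j in range(n):
            if i != j and adj[i][j] and any(p[j][i] for p in powers):
                return False
    return True

T, F = True, False
chain = [[T, T, F], [F, T, T], [F, F, T]]    # 0 -> 1 -> 2, self-loops allowed, 2 absorbing
cycle = [[F, T, F], [T, F, T], [F, F, T]]    # 0 <-> 1 violates condition 3
print(is_strongly_absorbing(chain), is_strongly_absorbing(cycle))
```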

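Lemma 1 (Properties of SA Graphs): Given a SA graph $G = (V, E)$, with $A \subseteq V$ the absorbing set:

1) The power graph $G^d$ is SA for every $d \in \mathbb{N}$.

2) $q \notin A \Rightarrow \exists q' \in V \setminus \{q\}$ s.t. $(q', q) \notin E$

3) $\exists d \in \mathbb{N} \left( \forall q \in V \setminus A \left( \exists q' \in A \left( (q, q') \in E^d \right) \right) \right)$

Proof: Statement 1 is immediate from Definition 13.

Statement 2 follows immediately from noting:

$$q \notin A \Rightarrow \exists q' \in V \setminus \{q\} \text{ s.t. } (q, q') \in E \Rightarrow (q', q) \notin E$$

Statement 3 follows, since from each node there is a path (length bounded by CARD($V$)) to an absorbing state. ■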
The performance of such control policies, and particularly the convergence time-complexity, is closely related to the spectral gap of the induced Markov chains. Hence we need to compute lower bounds on the spectral gap of the chains arising in the context of the proposed optimization, which (as we shall see later) have the strongly absorbing property. The following result computes such a bound as a simple function of the non-unity diagonal entries of $\Pi$.

Proposition 4 (Spectral Bound): Given an $n$-state PFSA $G = (Q, \Sigma, \delta, \Pi)$ with a strongly absorbing graph, the magnitude of the non-unity eigenvalues of the transition matrix $\Pi$ is bounded above by the maximum non-unity diagonal entry of $\Pi$.

Proof: Without loss of generality, we assume that $G$ has a single absorbing state (distinct absorbing states can be merged without affecting non-unity eigenvalues). Now, if $\mu$ is an eigenvalue of $\Pi$, then $\mu^d$ is an eigenvalue of $\Pi^d$ for every $d \in \mathbb{N}$. From Lemma 1:

C1 $\exists \ell \in \mathbb{N}$ s.t. $\Pi^\ell$ has no zero entry in the column corresponding to the absorbing state. Let $d_*$ be the smallest such integer.

C2 Every non-absorbing state has at least one zero element in the corresponding column of $\Pi^{d_*}$.

C3 Statements C1, C2 are true for any integer $d \geq d_*$.






We denote the column of ones as $e$, i.e., $e = [1 \cdots 1]^T$. Since $\Pi^d$ is (row) stochastic, we have $\Pi^d e = e$. Hence, if $v$ is a left eigenvector of $\Pi^d$ with eigenvalue $\mu^d$, then:

$$v\Pi^d e = ve = \mu^d ve \;\Rightarrow\; (1-\mu^d)\,ve = 0 \qquad (25)$$

implying that if $\mu^d \neq 1$, then $ve = 0$. Now we construct $C = [C_1 \cdots C_n]$, where $C_j = \min_i \Pi^d_{ij}$ (minimum column element). Considering the matrix $M = \Pi^d - eC$, we note:

$$(v\Pi^d = \mu^d v) \wedge (\mu^d \neq 1) \;\Rightarrow\; vM = \mu^d v \qquad (26)$$

Recalling that stationary probability vectors (Perron vectors) of stochastic matrices add up to unity, we have:

$$(v\Pi^d = v) \;\Rightarrow\; vM = v - veC = v - C \qquad (27)$$

which, along with the fact that $C$ is not a vector of all zeros, implies that an upper bound on the magnitudes of the eigenvalues of $M$ provides an upper bound on the magnitude of the non-unity eigenvalues of $\Pi^d$. Now, invoking the Gerschgorin Circle Theorem [19], [20], we get:

$$|\mu^d| \leq 1 - \sum_j C_j = 1 - C_a \;\Rightarrow\; |\mu| \leq (1 - C_a)^{1/d} \qquad (28)$$

where $C_a$ is the minimum column element corresponding to the absorbing state. $1 - C_a$ is the maximum probability of not reaching the absorbing state after $d$ steps from any state, which is bounded above by $a^{d_1} b^{d-d_1}$, where $a$ is the maximum non-diagonal entry in $\Pi$ not going to the absorbing state, $b$ is the maximum of the non-unity diagonal entries in $\Pi$, and $d_1$ is a bounded integer. Since any sequence of non-selfloops is absorbed in a finite number of steps (strongly absorbing property), we have a finite bound for $d_1$. Hence we have:

$$|\mu| \leq \lim_{d\to\infty} a^{d_1/d}\, b^{1-d_1/d} = b = \max_{q_i : \Pi_{ii} < 1} \Pi_{ii} \qquad (29)$$

This completes the proof. ■
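The bound of Proposition 4 can be sanity-checked on a small example. The sketch below is illustrative only: the transient block `Q_TRANSIENT` is a hypothetical strongly absorbing chain with a single absorbing sink (the sink row and column are omitted), and the largest-magnitude eigenvalue is estimated by plain power iteration, which is adequate here because the block is non-negative with a real, strictly dominant eigenvalue. For a chain with a single absorbing state, the eigenvalues of $\Pi$ are 1 together with the eigenvalues of the transient block, since $\Pi$ is block triangular.

```python
def dominant_modulus(q, iters=500):
    """Estimate the largest eigenvalue magnitude of a non-negative matrix by power iteration."""
    n = len(q)
    x = [1.0] * n
    estimate = 0.0
    for _ in range(iters):
        y = [sum(q[i][j] * x[j] for j in range(n)) for i in range(n)]
        estimate = max(abs(v) for v in y)   # x has max-norm 1, so this is the growth ratio
        if estimate == 0.0:
            return 0.0
        x = [v / estimate for v in y]
    return estimate

# Hypothetical transient block: self-loops on the diagonal, forward edges above it,
# and the missing row mass (0.2, 0.25, 0.4) goes to the absorbing sink.
Q_TRANSIENT = [
    [0.50, 0.30, 0.00],
    [0.00, 0.25, 0.50],
    [0.00, 0.00, 0.60],
]

bound = max(Q_TRANSIENT[i][i] for i in range(3))   # maximum non-unity diagonal entry
largest = dominant_modulus(Q_TRANSIENT)
print(largest, bound)
```

In this example the estimate coincides with the bound, which is what one expects when the states can be ordered so that the transition matrix is triangular.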
Next, we make rigorous our notion of policy performance, and near-global or $\epsilon$-optimality.

Definition 14 (Policy Performance & $\epsilon$-Optimality): The performance vector $\rho^S$ of a given routing policy $S$ is the vector of node-specific probabilities of a packet eventually reaching the sink. A policy $U$ has Utopian performance if its performance vector (denoted as $\rho^U$) element-wise dominates the one for any arbitrary policy $S$, i.e. $\forall q_i \in Q^N,\ \rho^U_i \geq \rho^S_i$. A policy $P$ has $\epsilon$-optimal performance, if for some given $\epsilon > 0$, we have:



(30) 



For a chosen $\theta$, the limiting policy $P_\theta$ computed by Algorithm 1 results in element-wise maximization of the measure vector over all possible supervision policies (where supervision is to be understood in the sense of the defined control philosophy), and is related to the policy performance vector $\rho^{P_\theta}$ as follows. Selective disabling of the transitions dictated by the policy $P_\theta$ induces a controlled PFSA, which represents the optimally supervised network for a given $\theta$. Let the transition matrix for this optimized PFSA be $\Pi_\theta$, and its Cesaro limit be $\mathscr{P}_\theta$. (Note: $\Pi_\theta$, $\mathscr{P}_\theta$ are stochastic matrices.) Then:



v qi e Q N , nx 



*,(9sink) 



Pi 



(31) 



In the sequel, we would need to distinguish between the optimal measure vector (optimal for a given $\theta = \theta'$) computed by Algorithm 1, and the one obtained by first computing the optimal measure vector for $\theta = \theta'$ and then using the PFSA structure obtained in the process to compute the measure vector for some other value $\theta = \theta''$. These two vectors may not be identical.

Notation 5: In the sequel, we denote the vector obtained in the latter case as $\nu_{(\theta',\theta'')}$, implying that we have $\forall \theta,\ \nu_{(\theta,\theta)} = \nu^*_\theta$.

Lemma 2: We have the following equalities:

$$\lim_{\theta'\to 0^+} \nu_{(\theta,\theta')} = \rho^{P_\theta} \qquad (32a)$$

$$\lim_{\theta\to 0^+} \nu_{(\theta,\theta)} = \rho^{U} \qquad (32b)$$


Proof: Recalling Eq. (31), and noting that for any PFSA with transition matrix $\Pi$ (with Cesaro limit $\mathscr{P}$), we have $\lim_{\theta\to 0^+} \nu_\theta = \lim_{\theta\to 0^+} \theta\left[I-(1-\theta)\Pi\right]^{-1}\chi = \mathscr{P}\chi$, we have Eq. (32a). In general, different choices of $\theta$ result in different disabling decisions, and hence different policies. However, since there is at most a finite number of distinct policies for a finite network, there must exist a $\theta_*$ such that for all choices $0 < \theta \leq \theta_*$, the policy remains unaltered (although the measure values may differ). Since executing the optimization with vanishingly small $\theta$ yields a performance vector identical (in the limit) to the optimal measure vector, which element-wise dominates the one for any arbitrary policy, the policy obtained for $0 < \theta \leq \theta_*$ has Utopian performance. Hence:

$$\lim_{\theta\to 0^+} \nu_{(\theta,\theta)} = \lim_{\theta\to 0^+} \nu_{(\theta_*,\theta)} = \rho^{U} \qquad (33)$$

This completes the proof. ■
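The limit used in the proof, namely that the measure approaches the probability of eventually reaching the sink as $\theta \to 0^+$, can be observed numerically. The sketch below is an illustration under assumptions: the 4-state chain `PI` is hypothetical, the measure is obtained by fixed-point iteration of $\nu \leftarrow (1-\theta)\Pi\nu + \theta\chi$ rather than by matrix inversion, and the absorption probabilities `ABSORB` were worked out by hand for this particular chain.

```python
def measure(pi, chi, theta, tol=1e-13):
    """Fixed-point iteration for nu = theta * [I - (1 - theta) * Pi]^(-1) * chi."""
    n = len(chi)
    nu = [0.0] * n
    while True:
        new = [(1 - theta) * sum(pi[i][j] * nu[j] for j in range(n)) + theta * chi[i]
               for i in range(n)]
        if max(abs(a - b) for a, b in zip(new, nu)) < tol:
            return new
        nu = new

# Hypothetical chain: 0 -> 1 -> sink (state 2), with losses to the dump (state 3).
PI = [
    [0.0, 0.8, 0.0, 0.2],
    [0.0, 0.0, 0.9, 0.1],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
CHI = [0.0, 0.0, 1.0, 0.0]
ABSORB = [0.8 * 0.9, 0.9, 1.0, 0.0]   # probability of eventually reaching the sink

errors = []
for theta in (0.1, 0.01, 0.001):
    nu = measure(PI, CHI, theta)
    errors.append(max(abs(a - b) for a, b in zip(nu, ABSORB)))
print(errors)
```

The error shrinks roughly in proportion to $\theta$, while the number of sweeps needed grows roughly like $1/\theta$, which is the trade-off between optimality and convergence rate discussed in the text.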
Computation of the critical $\theta_*$ is non-trivial from a distributed perspective, although centralized approaches have been reported [14]. Thus it is hard to guarantee Utopian performance in Algorithm 1. Also, $\theta_*$ may be too small, resulting in an unacceptably poor convergence rate. Nevertheless, we will show that, given any $\epsilon > 0$, one can choose $\theta$ to guarantee $\epsilon$-optimal performance of the limiting policy in the sense of Definition 14. We would need the following result.



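Lemma 3: Given any PFSA, with transition matrix $\Pi$ and corresponding Cesaro limit $\mathscr{P}$, and $\mu$ being a non-unity eigenvalue of $\Pi$ with maximal magnitude, we have:

$$\left\| \theta\left[I-(1-\theta)\Pi\right]^{-1} - \mathscr{P} \right\|_\infty \leq \frac{\theta}{1-|\mu|} \qquad (34a)$$

$$\left\| \nu_{(\theta,\theta)} - \lim_{\theta'\to 0^+} \nu_{(\theta,\theta')} \right\|_\infty \leq \frac{\theta}{1-|\mu|} \qquad (34b)$$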



Proof: Denoting M = [l — (1 — 0)Il] 



M 



(1 - 8)!!}- 1 - 0>J2{1 - 8f 



k=0 



= £(i-0) fc (n- 



k=0 



=pr-(i-0)(n 
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We note, that if u is a left eigenvector of II with unity eigen- 
value, then u3? — u. Also, if the eigenvalue corresponding 
to u is strictly within the unit circle, then u& = 0. After 
a little algebra, it follows that if u is the left eigenspace 
(denoted as ^(l)) corresponding to unity eigenvalues of II, 
then uM = 0, otherwise, uM = — ,} „, u, where u is a 
non-unity eigenvalue for II. Invoking the definition of induced 
matrix norms, and noting H^Hoo = ||A T ||i for any square 
matrix A: 

\\M\\ oa = max ||uM||i= max \\uM\\ x (35) 

||u||l=l ||u||i = lAug.E(l) 

We further note that since [I— (1 — 8)(H — is guaranteed 

to be invertible lfl4ll . its eigenvectors form a basis, implying: 



with 



Ej C 3 u3 



1 



where v? are eigenvectors of [I— (1 — #)(II— 3P)\~ X with non- 
unity eigenvalues, and Cj are complex coefficients. An upper 
bound for ||M||i can be now computed as: 



I Ml 



< 



i-(i-e)\n\ 



E 



< 



where /i is a non-unity eigenvalue for II with maximal 
magnitude. This establishes Eq. d34a). Finally, noting: 



lim = (6{I - (1 - - 9) X 



e'->o+ 



establishes Eq. ( j34b| i. ■ 

The next proposition is the key result relating a specific choice of $\theta$ to guaranteed $\epsilon$-optimal performance.

Proposition 5 (Global $\epsilon$-Optimality): Given any $\epsilon > 0$, choosing $\theta = \epsilon/m^2$, where $m = \max_{q \in Q} \mathrm{CARD}(\mathcal{N}(q))$, guarantees that the limiting policy computed by Algorithm 1 is $\epsilon$-optimal in the sense of Definition 14.
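As a small worked example of the parameter choice in Proposition 5, the sketch below computes $\theta = \epsilon/m^2$ from a neighbor map. The function name `choose_theta` and the dictionary-based topology are hypothetical; in a deployed network $m$ would have to be known or bounded in advance, which is an assumption of this sketch.

```python
import math

def choose_theta(neighbors, epsilon):
    """Return theta = epsilon / m^2, with m the largest neighborhood size in the network."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    m = max(len(set(nbrs)) for nbrs in neighbors.values())
    return epsilon / (m * m)

# Hypothetical topology: node "c" has the largest neighborhood (4 neighbors).
topology = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d", "e"],
    "d": ["c"],
    "e": ["c"],
}
theta = choose_theta(topology, epsilon=0.001)
print(theta)
```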

Proof: We observe that the limiting measure values 
Vg°\i — v* g \i computed by Algorithm [T] can be represented 
by convergent sums of the form (ay are non-negative reals): 



V<& e Q 



N poo. 



where f3g = ,\ , , with u being a maximal non-unity 

1 I Ml 

eigenvalue of the transition matrix of the PFSA computed by 
Algorithm [T] at 8. Next we claim: 



We (0,8], 



< m 2 8 



(39) 



We observe that, for any 8', the optimal policy Pg> can be 
obtained by beginning with the PFSA induced by Pg (which 
is the optimal policy at 8), and then executing the centralized 
iterative approach (14), resulting in a sequence of element-wise 
non-decreasing measure vectors converging to the optimal V^: 



uf) > „W > „M > 



[**] 



v a , 



(40) 



where v3 is the vector obtained after the k th iteration, and 



(36) k* < oo is the number of required iterations. Since, i/L 



[fc] 



9' [l — (1 — 6'')Il[' £ ]] x< where the transition matrix after k 
iteration is II^ and setting A 



^(6 e 1 ) we nave: 



A [k] 



tt[0]^oo 

11 > v (e,6') 



(i - 6»') [i - (i - 6i')n [fc] ] (n [fe] 
= ^ {e' [i - (i - e')u^] (!#] - nw )^ e>) } 



3 [fc] 

T (0->fc) 



For (/i e Q, let HT;" - " 6- ' be the set of transitions (a, — >• f/j), 
which are updated (i.e. enabled if disabled or vice verse) to 
go from the configuration corresponding to vf) to the one 
corresponding to uL. We note that: 



U 



(0->fc) 



where 



= U 



(0->fc) 



U 



(o- 



nu 



(0->fc) 



u 



(o- 



■D nU (o^) 



The i th row 



of III 1 ! is obtained from IF ! Ifl4ll by disabling controllable 
transitions Oj -H> gj if I'l/'lj > vfi \% (and enabling otherwise), 
and each such update leads to a positive contribution in 
the corresponding row of wl, . It follows that updating any 
transition t = (q. k q.j) G NUf° ' 



(37) positive contribution to given by 



flu| M ) leads to a 



implying that for each € Q, J^VJi (See Notation Bj) is a 
monotonically decreasing function of 8i in the domain[0, 8]. 
We note that if the following statement: 



'6 \3 



Vquqj G Q^, 



> v (e,ei)\j 



is true, then we have Utopian performance for policy Pg, i.e., 



P 



p u . Hence, if p Pe ^ p u , then we must have: 



38 2 < 8,3qi,q 3 - e Q 



N 



> v, 



(e,0i)\j/ 



upon which Eq. ( (37| ), along with the bound established in 
Eq. ( |34a| ), guarantees that if , are nodes (in consecutive 
order) that satisfy the above statement, then: 



lim I vy a a 



(e,6/i)li 



< 



/3e 



(38) 



C t = n(q t ,a) 



,[0]| 



,[0]| 



(41) 



and for every transition = (qi — > qk) <E W leads to a 
negative contribution to wj^'lj, given by: 

(42) 
(43) 



[0]| 



implying that: Wg, 



Cf = -II(g i ,a') \ v gl 



^ ^ g( % , <t)^^ = /3 e (See Eq. ([38)) 



Since the rows corresponding to the absorbing states have no 
controllable transitions, absorbing states must remain absorb- 
ing through out the iterative sequence, and the corresponding 

k*} are strictly 0. It follows: 



entries in uj l a 7 for all k S {0, 
I 



, if qi is absorbing 
£[0,f3g8] , otherwise 



(44) 
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Stochasticity of B^ 1 implies that in the limit 0' -> 0+, B^' 1 
converges to the Cesaro limit of B^'. Applying Lemma [ij 



lim b£ ] | 



< 



i - w 



where ^ is a non-unity eigenvalue for W s , with maximal 
magnitude. Using the invariance of the absorbing state set, and 
observing that the Cesaro limit liiri0/_ >o + b£' has strictly zero 
columns corresponding to non-absorbing states, we conclude: 

W G (0,6], A l e % Z l^p 8 ,e'p 8 e Z (3 e ,f3 e 9 
(45)

It is easy to see that the PFSA induced by $P_\theta$ is strongly absorbing (Definition 13), and so is each one obtained in the iteration. Also, the virtual nodes in our network model have no controllable transitions, and have no self-loops. Physical nodes can have self-loops arising from disablings; but for a non-absorbing node with at most $m$ neighbors, the self-loop probability is bounded by $(m-1)/m$, which then implies $\beta_{\theta'}, \beta_\theta \leq \frac{1}{1-(m-1)/m} = m$ (Proposition 4). Hence:

$$\forall \theta' \in (0,\theta], \quad \left\|\Delta^{[k]}\right\|_\infty \leq m^2\theta \qquad (46)$$

Thus, if we choose $\theta = \epsilon/m^2$, we can argue:



Vfc g {0, 



lim 



lim Ug 

)'->0+ 



,&*}, W G (0,0], ||A 
< e 



Wi 



< 



— lim v 



(9.8') 



< 



( Continuity ) 
V of norm / 



< 



(Using Lemma[2]) 



which completes the proof. ■ 
Once we have guaranteed convergence to an $\epsilon$-optimal policy, we need to compute asymptotic bounds on the time-complexity of route convergence, i.e., how long it takes to converge to the limiting policy so that the local routing decisions no longer fluctuate. In practice, the convergence time depends on the network delays, the degree to which the node updates are synchronized, etc., and is difficult to estimate. In this paper, we neglect such effects to obtain an asymptotic estimate in the ideal situation. This allows us to quantify the dependence of the convergence time on key parameters such as $N$, $m$ and $\epsilon$. Future work will address situations where such possibly implementation-dependent effects are explicitly considered, resulting in potentially smaller convergence rates.

Proposition 6 (Asymptotic Runtime Complexity): With no communication delays and assuming synchronized updates, the convergence time $T_c$ to $\epsilon$-optimal operation for a network of $N$ physical nodes and a maximum of $m$ neighbors satisfies:

$$T_c = O\!\left(\frac{N m^2}{\epsilon\,(1-\gamma_*)}\right)$$

where $\gamma_*$ is a lower bound on drop probabilities.

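Proof: Synchronized updates imply that we can assume the measure vector to update via the following recursion:

$$\nu^{[0]}_\theta = 0 \quad \text{(zero vector)} \qquad (47a)$$

$$\nu^{[k+1]}_\theta = (1-\theta)\,\Pi\,\nu^{[k]}_\theta + \theta\chi \qquad (47b)$$

which can be used to obtain the upper bound: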

which can be used to obtain the upper bound: 



< 



(I- 



(48) 



implying that after $k$ updates, each node is within $(1-\theta)^k$ of its limiting value. Denoting the smallest difference of measures as $\Delta_*$, we note that $(1-\theta)^k \leq \Delta_*$ would guarantee that no further route fluctuation occurs, and the network operation will be $\epsilon$-optimal from that point onwards. To estimate $\Delta_*$, we note that 1) comparisons cannot be made for values closer than the machine precision $M_0$, and 2) the lowest possible non-zero measure in the network occurs at the network boundaries if we assume the worst case scenario in which the drop probability is always $\gamma_*$. We recall that the measure of a node is the sum of the measures of all paths initiating from the particular node and terminating at the sink. Also, note that any such path accumulates a multiplicative factor of $(1-\theta)^2(1-\gamma_*)$ in each hop. Assuming the worst case, where a given node is $N$ hops away and has a single path to the sink, we conclude that the smallest non-zero measure of any node is bounded below by $\left((1-\theta)^2(1-\gamma_*)\right)^N$, inducing the following bound:

$$\Delta_* \geq M_0\left((1-\theta)^2(1-\gamma_*)\right)^N \qquad (49)$$

and hence a sufficient condition for convergence is:

$$(1-\theta)^k = M_0\left((1-\theta)^2(1-\gamma_*)\right)^N \;\Rightarrow\; (1-\theta)^{k-2N} = M_0(1-\gamma_*)^N \;\Rightarrow\; k = 2N + \frac{\log M_0}{\log(1-\theta)} + N\,\frac{\log(1-\gamma_*)}{\log(1-\theta)}$$

Treating $M_0$ as a constant, we have $\frac{\log M_0}{\log(1-\theta)} = O\!\left(\frac{1}{\theta}\right)$. Since $\theta$ must be small for near-optimal operation and considering the worst case $\gamma_* \ll 1$, we have:



(1 - 0) fel =1-7* where ki = 



log(l -7*) 
log(l - 0) 



=*(i - fci0) - 1 - 7* => he = i,^h 
=>/c = o ( n + ; + — r ) = o 



7* 



l-tl-T*))" 1 ^* 1 °G(i-7*) 

1 N \ „ / N 



>k = O 



Nm 1 



<1 - 7*) 



1-7*); ~V0(!-7*) 

(Using Proposition [5]) 



Under the assumption of no communication delay, we have 
T c = 0(k), which completes the proof. ■ 
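The sufficient condition used in the proof, $(1-\theta)^k = M_0\left((1-\theta)^2(1-\gamma_*)\right)^N$, can be solved for $k$ in closed form, giving $k = 2N + \log M_0/\log(1-\theta) + N\log(1-\gamma_*)/\log(1-\theta)$. The sketch below evaluates that expression; the function name, the default machine precision (double-precision epsilon), and the sample parameter values are assumptions made for illustration.

```python
import math

def sufficient_updates(n_hops, theta, gamma, machine_eps=2.0 ** -52):
    """Solve (1 - theta)^k = M0 * ((1 - theta)^2 * (1 - gamma))^N for k."""
    log_decay = math.log(1.0 - theta)          # negative for 0 < theta < 1
    return (2 * n_hops
            + math.log(machine_eps) / log_decay
            + n_hops * math.log(1.0 - gamma) / log_decay)

N, THETA, GAMMA, M0 = 10, 0.01, 0.1, 2.0 ** -52
k = sufficient_updates(N, THETA, GAMMA, M0)

lhs = (1.0 - THETA) ** k
rhs = M0 * ((1.0 - THETA) ** 2 * (1.0 - GAMMA)) ** N
print(math.ceil(k), lhs, rhs)
```

Halving $\theta$ roughly doubles the dominant terms of $k$, which reflects the $1/\theta$ (and hence $m^2/\epsilon$) dependence in the stated complexity.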
It follows from Proposition 6 that for constant $\epsilon$ and $\gamma_*$, and large networks with a relatively smaller number of local neighbors such that $N \gg m$, we will have $T_c = O(N)$. Detailed simulation, on the other hand, indicates that this bound is not tight, as illustrated in Figure 3(a), where we see a logarithmic dependence instead.

V. Properties & Implementation Details 

TABLE I
Instantaneous Node Data Table For GODDeS

Id.      Neighbor #   Current Measure   Drop Probability   Forwarding Decision
$i_0$    (Self)       $\nu_0$           $d_0$
...      ...          ...               ...                ...
$i_m$    $m$          $\nu_m$           $d_m$              1

Fig. 3. Convergence complexity: (a) illustrates little dependence of convergence on network size, (b) captures the $O(1/\epsilon)$ dependence

The GODDeS pseudo-code in Algorithm 1 specifies the instructions executing on each physical node, in an asynchronous and distributed manner. By design, GODDeS only uses information that is locally available, and global performance guarantees are achieved by propagating this local
information via neighbor-neighbor communication. The idea 
of such information percolation in networks is not particularly 
new; the novelty of GODDeS lies in the exploitation of sound 
theoretical results from language measure theory to design 
such communication. The node-specific measure values computed by GODDeS essentially reflect a generalized distance vector that takes into account link-specific drop probabilities, and they update as network statistics (e.g. the drop probabilities) change (albeit at a slower time scale). Using the notion of quantitative measures of probabilistic regular languages, GODDeS successfully integrates the well-known notions of distance vector and link states into one single node-specific scalar, namely the measure at each node. Thus the amount of data that needs to be communicated is very small, implying a low communication overhead. Updating these measure values is also very simple, as stipulated in Algorithm 1. Routing then proceeds by local multi-casting to neighbors which currently have a strictly higher measure; and our theoretical results guarantee that such a policy will essentially result in $\epsilon$-optimal global performance. Furthermore, as we show in Proposition 7, the optimal routing policy is inherently free from loops and the formidable count-to-infinity problem.

Fig. 4. Convergence dynamics: (a) rapid convergence under large random sink movements, (b) robust response to large zero-mean variations in the drop probabilities, (c) response to a failure cascade where 50% of the nodes are killed

Proposition 7 (Properties): The limiting GODDeS policy:

1) is loop-free

2) is the unique loop-free policy that disables the smallest set of transitions among all policies which induce the same measure vector for a given $\theta$.

Proof: (1) Absence of loops follows immediately from noting that, in the limiting policy, a controllable transition $q_i \to q_{(ij)}$ is enabled if and only if $q_j$ has a limiting measure strictly greater than that of $q_i$, implying that any sequence of transitions (with no consecutive repeating states) goes to either the dump or the sink in a finite number of steps.

(2) follows directly from the uniqueness and the maximal permissivity property of optimal policies computed by language measure-theoretic optimization (see [14]). ■

In this paper, we refrain from explicitly designing specific headers and data-structures that would be required for practical implementation of GODDeS. However, one can easily tabulate the data that needs to be maintained at each node (see Table I). In particular, each node needs to know the unique network id. of each neighbor that it can communicate with (Col. 1), and their current measure values (Col. 3). The drop probabilities for communicating from self to each of those neighbors must be maintained as well, for the purpose of carrying out the GODDeS updates (Col. 4). The forwarding decision is a neighbor-specific Boolean value (Col. 5), which is set to 1 if the neighbor currently has a strictly higher measure than self, and 0 otherwise. The packets are then forwarded by randomly choosing (in an equiprobable manner) between the enabled neighbors, i.e., the ones with a true forwarding decision. Note that this node data updates when the measures of the neighbors change (Col. 3), or the drop probabilities (Col. 4) update. However, changes in the measures do not necessarily lead to a change in the forwarding decisions. Also, note that the routing is inherently probabilistic (due to the possibility that multiple enabled neighbors may exist for a given node). Furthermore, the optimal policy disables transmission to as few neighbors as possible for a specified $\theta$ (Proposition 7), and hence exploits multi-path transmissions in an optimal manner.
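The per-node bookkeeping described above can be sketched as a small data structure. This is a hypothetical illustration, not the authors' implementation: the class and field names are invented, the measure update itself (Algorithm 1) is not reproduced, and only the forwarding rule stated in the text is encoded (enable a neighbor if and only if its measure is strictly higher than the node's own, then pick uniformly among enabled neighbors).

```python
import random
from dataclasses import dataclass, field

@dataclass
class NeighborEntry:
    measure: float            # Col. 3: neighbor's last reported measure
    drop_probability: float   # Col. 4: estimated drop probability on the link
    forward: bool = False     # Col. 5: forwarding decision

@dataclass
class NodeTable:
    self_measure: float = 0.0
    neighbors: dict = field(default_factory=dict)   # Col. 1: neighbor id -> entry

    def update_neighbor(self, neighbor_id, measure, drop_probability):
        """Record fresh neighbor data and recompute all forwarding decisions."""
        self.neighbors[neighbor_id] = NeighborEntry(measure, drop_probability)
        self.refresh_decisions()

    def refresh_decisions(self):
        for entry in self.neighbors.values():
            entry.forward = entry.measure > self.self_measure   # strictly higher only

    def next_hop(self, rng=random):
        """Choose equiprobably among enabled neighbors; None if nothing is enabled."""
        enabled = [nid for nid, entry in self.neighbors.items() if entry.forward]
        return rng.choice(enabled) if enabled else None

table = NodeTable(self_measure=0.4)
table.update_neighbor("a", measure=0.6, drop_probability=0.10)
table.update_neighbor("b", measure=0.4, drop_probability=0.05)   # equal measure: disabled
table.update_neighbor("c", measure=0.9, drop_probability=0.30)
enabled_ids = sorted(nid for nid, e in table.neighbors.items() if e.forward)
hop = table.next_hop(random.Random(1))
print(enabled_ids, hop)
```

Keeping the decision as a derived field means a change in a neighbor's measure updates the table without necessarily changing which neighbors are enabled, matching the remark in the text.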

In remote sensing applications nodes often have limited 
energy, necessitating route updates as high-traffic nodes get 
depleted. Also, local congestion arising due to the bursty na- 
ture of such communication may require re-routing. Note that 
congestion leads to higher packet drop probabilities, and gets 
reflected in the local link-specific drop probability estimations. 
Thus, GODDeS automatically corrects for network congestion 
to a large degree, by modulating the forwarding decisions as 
specific areas experience high traffic. However this does not 
correct for depleting energy levels (until the nodes actually 
die). Energy-aware reorganizations can be nevertheless carried 
out within the GODDeS framework autonomously and in a 
decentralized manner. Specifically, each node can regulate incoming traffic by deliberately reporting lower values of its current self-measure to its neighbors, obtained by scaling the measure with a factor $\zeta(q_i, k)$, where $\forall q_i \in Q,\ k \in [0, \infty)$, $\zeta(q_i, k) \in [0, 1]$ is a multiplicative factor which is modulated to have decreasing values as node energy gets depleted, or as local congestion increases. Such modulation forces automatic self-organization to compute alternate routes that tend to avoid the particular node. The dynamics of such context-aware modulation may be non-trivial; while for slowly varying $\zeta(q_i, k)$ the convergence results presented here are expected to hold true, rapid fluctuations in $\zeta(q_i, k)$ may be problematic.
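A minimal sketch of such measure modulation follows. The text only requires a multiplicative factor in $[0, 1]$ that decreases as energy is depleted; the specific choice below (the fraction of remaining energy) and both function names are assumptions made purely for illustration.

```python
import math

def energy_factor(remaining_energy, initial_energy):
    """Hypothetical modulation factor: fraction of energy left, clipped to [0, 1]."""
    return max(0.0, min(1.0, remaining_energy / initial_energy))

def reported_measure(self_measure, zeta):
    """Measure advertised to neighbors: the true self-measure scaled down by zeta."""
    if not 0.0 <= zeta <= 1.0:
        raise ValueError("zeta must lie in [0, 1]")
    return zeta * self_measure

full = reported_measure(0.8, energy_factor(100.0, 100.0))
depleted = reported_measure(0.8, energy_factor(25.0, 100.0))
print(full, depleted)
```

A node advertising a lower measure is less likely to appear strictly better than its neighbors, so traffic is steered away from it without any change to the forwarding rule itself.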

VI. Verification, Validation & Discussion 

Extensive simulations have been performed on the NS2 network simulator, running on a 32-core (64-bit architecture) workstation with 128 GB of RAM. We investigate how convergence times scale as a function of the network size in Figures 3(a-b). $10^2$ random topologies were considered for each $N$ (increased from 25 to 1600), and the mean times along with the max-min bars are plotted in Figure 3(a). Note that the abscissa is on a logarithmic scale, and the near-linear nature of the plot indicates a logarithmic dependence of the convergence time on network size, implying that the bound computed in Proposition 6 is possibly not tight. The dependence on $\epsilon$ shown in Figure 3(b) (for $N = 10^3$) is hyperbolic, as expected, leading to a near-linear dependence after a smoothing spline fit on a log-log scale. Note that the convergence times are not CPU times, but are estimated from NS2 output (using the 802.11 standard).

The theoretical convergence results are illustrated in Figure 4(a-c), which were generated on a $10^4$ node network. Plate (a) illustrates the variation of the number of route updates (# of forwarding decision corrections) and the norm of the performance vector $\rho^P$ (scaled up by a multiplicative factor of 2) when the sink is moved around randomly at a slower time scale. Since $\rho^P$ is the vector of end-to-end success probabilities (see Definition 14), its norm captures the degree of expected throughput across the network. Note that sink changes induce self-organizing corrections, which rapidly die down, with the performance converging close to the global optimal ($\epsilon = 0.001$ was assumed in all the simulations).

Fig. 6. Time lapse plates for progressive node deaths: Top row indicates failed regions in black, middle row illustrates packet path signatures to the sink from operational nodes, and bottom row shows the level sets for the scalar field induced by the node measures



The drop probabilities are chosen randomly and, on average, held constant in the course of the simulation illustrated in plate (a) (zero-mean Gaussian noise is added to illustrate robustness). Note that the seemingly large fluctuations in the performance norm are unavoidable; the interval $\tau$ is approximately the time it takes for information to percolate through the network, and hence this much time is necessary at a minimum for decentralized route convergence. Plate (b) illustrates the effect of large zero-mean stochastic variations in the drop probabilities. Each node estimates the drop probabilities from a simple windowed average of the link-specific packet drops. We note that large sustained fluctuations result in sustained corrections in the forwarding decisions (the number of which no longer goes to zero). However, the norm of the performance vector converges and holds steady, indicating a highly stable quality of service. This clearly illustrates that the information percolation strategy induces a low-pass filter eliminating high-frequency fluctuations, yielding self-organizing routes that maintain high throughput in a robust manner. Note that a small number of route fluctuations always occur (as shown by the non-zero number of corrections), but the key point is that this does not induce significant variations in the performance. Plate (c) illustrates the case where a cascading failure was simulated by turning off 50% of the nodes in the network. We measure the individual entries of $\rho^P$ for a pre-determined set of nodes, which lie at a maximal distance from the sink (and are not killed). Note that the expected throughputs stabilize before the cascade, and the routes rapidly reorganize after the failure event, upon which the performance regains convergent values. The entire process is perfectly decentralized, with the nodes identifying dead or non-responsive neighbors, and updating both their set of possible neighbors (Col. 2 in Table I) and self measures.

Convergence dynamics is explicitly illustrated in Figure 5 for a dense network of $10^4$ nodes, placed on a uniform rectangular grid (uniformity merely aids visualization). We see the gradual spreading out of the non-zero measure updates from the sink. The plates on top show the gradient of the scalar field induced by the node measures, while those at the bottom illustrate the level sets. The voids are conglomerations of dead or non-responsive nodes. Other regions (marked "POOR") consist of nodes that are experiencing poor communication. Note that the routes tend to avoid these regions. As before, the drop probabilities are chosen randomly, and held constant on average with zero-mean Gaussian noise. Also, note that the two color tones illustrate the possibility of simple decentralized thresholding, to autonomously segregate the network into classes which have a certain degree of connectivity to the sink, based on the convergent value of the estimated measures.

Progressive failures are simulated in Figure [6], addressing 
situations with gradual node depletion. The top row shows the 
failed regions in black. The network is initialized with 10^4 
nodes with energy levels distributed uniformly over a 
pre-specified range, leading to a realistic scenario in which 
nodes fail due to various unmodeled effects in addition to energy 
spent in communication. Nodes are assumed to fail in clusters 
of ~10^2, creating dead regions. The middle row shows packet 
traces to the sink from operational nodes, and the bottom row 
illustrates the level sets. Note that with a small number of dead 
regions, we can see very little "white" in the middle row, 
indicating high route utilization and low congestion. As the 
nodes fail, we see more white space, indicating that most 
packets are now taking similar routes. Note that congestion 
leads to higher drop probabilities, which are estimated on the 
fly and incorporated via GODDeS in local decision-making, 
thus implying significant congestion-awareness. 
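
The text does not specify how drop probabilities are estimated on the fly. One common choice, assumed here purely for illustration, is an exponentially weighted moving average over per-packet outcomes on a link. The class name `DropEstimator` and the smoothing factor `alpha` are hypothetical and not part of GODDeS as described.

```python
class DropEstimator:
    """Running estimate of a link's packet drop probability (sketch).

    Assumption: the sender learns, per packet, whether it was dropped
    (for example from a missing acknowledgement). The estimator is an
    exponentially weighted moving average, not the paper's method.
    """

    def __init__(self, alpha=0.1, initial=0.0):
        self.alpha = alpha        # weight given to the newest observation
        self.estimate = initial   # current drop probability estimate

    def update(self, dropped):
        """Fold one packet outcome into the estimate and return it."""
        sample = 1.0 if dropped else 0.0
        self.estimate += self.alpha * (sample - self.estimate)
        return self.estimate


# Usage: a congested link starts dropping packets, and the estimate
# rises, which a node could then feed into its local routing decision.
link = DropEstimator(alpha=0.5)
first = link.update(True)    # one drop observed
second = link.update(False)  # one successful delivery observed
```

A larger `alpha` tracks congestion faster but is noisier; a smaller value suits the slowly varying drop probabilities assumed in the model.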

VII. Conclusions & Future Work 

This paper introduces GODDeS: a new routing algorithm 
designed to effectively exploit high quality paths in lossy ad- 
hoc wireless environments, typically with a large number of 
nodes. The routing problem is modeled as an optimal control 
problem for a decentralized Markov Decision Process, with 
links characterized by locally known packet drop probabilities 
that either remain constant on average or change slowly. The 
equivalence of this optimization problem to that of perfor- 
mance maximization of an explicitly constructed PFSA allows 
us to apply the theory of quantitative measures of probabilistic 
regular languages, and design a distributed, highly efficient 
solution approach that attempts to minimize source-to-sink drop 
probabilities across the network. Theoretical results provide 
rigorous guarantees on global performance, showing that the 
algorithm achieves near-global optimality, in polynomial time. 
It is also argued that GODDeS is significantly congestion- 
aware, and exploits multi-path routes optimally. Theoretical 
development is supported by high-fidelity network simulation. 

Future work will proceed in the following directions, pri- 
marily aimed at investigating and consequently relaxing some 
of the key assumptions made in this paper: 

1) Design explicit strategies for energy and congestion 
awareness within the GODDeS framework. In particular, 
investigate the ramifications of various choices of the 
measure reduction factor described in Eq. ( [51) . 

2) Generalize the analysis to multiple sinks, which is 
not too difficult in view of the fact that most of the 
theoretical results carry over to the general case. 

3) We assumed that the link-specific drop probabilities are 
estimated at the nodes. Grossly incorrect estimations will 
translate to incorrect routing decisions, and decentralized 
strategies for robust identification of these parameters 
need to be investigated in greater depth. 

4) Explicit design of implementation details such as packet 
headers, node data structures, and pertinent neighbor-neighbor 
communication protocols. 

5) Hardware validation with networks of different sizes, 
and with induced failure situations. 

References 

[1] D. P. Bertsekas and R. G. Gallager, "Distributed asynchronous 
Bellman-Ford algorithm," in Data Networks. Prentice Hall, Englewood 
Cliffs, 1987, ch. 5.2.4, pp. 325-333. 

[2] C. E. Perkins and E. M. Royer, "The ad hoc on-demand distance-vector 
protocol," in Ad Hoc Networking, C. E. Perkins, Ed. Addison-Wesley, 
2001, ch. 6, pp. 173-219. 

[3] C. E. Perkins and E. M. Royer, "Ad-hoc on-demand distance vector 
routing," in 2nd IEEE Workshop on Mobile Computing Systems and 
Applications, New Orleans, USA. IEEE, February 1999, pp. 90-100. 

[4] D. B. Johnson, D. A. Maltz, and J. Broch, "DSR: The dynamic source 
routing protocol for multihop wireless ad hoc networks," in Ad Hoc 
Networking, C. Perkins, Ed. Addison-Wesley, 2001, pp. 139-172. 

[5] A. Pandey, M. N. Ahmed, N. Kumar, and P. Gupta, "A hybrid routing 
scheme for mobile ad hoc networks with mobile backbones," in Inter- 
national Conference on High Performance Computing, IEEE. IEEE, 
December 2006, pp. 411-423. 

[6] G. Koltsidas, G. Dimitriadis, and F.-N. Pavlidou, "On the performance 
of the HSLS routing protocol for mobile ad hoc networks," Wirel. Pers. 
Commun., vol. 35, no. 3, pp. 241-253, 2005. 

[7] P. Jacquet, P. Mühlethaler, T. Clausen, A. Laouiti, A. Qayyum, and 
L. Viennot, "Optimized link state routing protocol," in IEEE INMIC'01, 
Lahore, Pakistan, IEEE. IEEE, December 2001, pp. 62-68. 

[8] M. Joa-Ng and I.-T. Lu, "A peer-to-peer zone-based two-level link state 
routing for mobile ad hoc networks," IEEE Journal on Selected Areas 
in Communications, vol. 17, no. 8, pp. 1415-1425, August 1999. 

[9] C. Intanagonwiwat, R. Govindan, and D. Estrin, "Directed diffusion: 
A scalable and robust communication paradigm for sensor networks," 
in Proceedings of the 6th annual international conference on Mobile. 
New York, USA: ACM Press, 2000, pp. 56-67. 

[10] D. S. J. D. Couto, D. Aguayo, J. Bicket, and R. Morris, "A 
high-throughput path metric for multi-hop wireless routing," Wireless 
Networks, vol. 11, pp. 419-434, 2005. 

[11] B. Awerbuch, D. Holmer, and H. Rubens, "High throughput route 
selection in multi-rate ad hoc wireless networks," in WONS, 2004, pp. 
253-270. 

[12] D. S. Bernstein, R. Givan, N. Immerman, and S. Zilberstein, "The 
complexity of decentralized control of Markov decision processes," in 
Proceedings of the Sixteenth Conference on Uncertainty in Artificial 
Intelligence (UAI-2000), 2000, pp. 32-37. 

[13] D. S. Bernstein, R. Givan, N. Immerman, and S. Zilberstein, "The 
complexity of decentralized control of Markov decision processes," 
Math. Oper. Res., vol. 27, no. 4, pp. 819-840, 2002. 

[14] I. Chattopadhyay and A. Ray, "Language-measure-theoretic optimal 
control of probabilistic finite-state systems," International Journal of 
Control, vol. 80, no. 8, pp. 1271-1290, Aug. 2007. 

[15] J. E. Hopcroft, R. Motwani, and J. D. Ullman, Introduction to Automata 
Theory, Languages, and Computation, 2nd ed. Addison-Wesley, 2001. 

[16] R. Bapat and T. Raghavan, Nonnegative Matrices and Applications. 
Cambridge University Press, 1997. 

[17] W. Rudin, Real and Complex Analysis, 3rd ed. McGraw Hill, New 
York, 1988. 

[18] I. Chattopadhyay and A. Ray, "Optimal control of infinite horizon 
partially observable decision processes modeled as generators of 
probabilistic regular languages," International Journal of Control, 
vol. 83, no. 3, pp. 457-483, March 2010. 

[19] S. Gerschgorin, "Über die Abgrenzung der Eigenwerte einer Matrix," Izv. 
Akad. Nauk. USSR Otd. Fiz.-Mat. Nauk, vol. 7, pp. 749-754, 1931. 

[20] R. S. Varga, Gerschgorin and His Circles. Germany: Springer, 2004. 



